Abstract

In a recent paper of Mateev et al. (2001), a new technique for program analysis
called fractal symbolic analysis was introduced and applied to verify the correctness
of a series of source-level transformations for cache blocking in LU decomposition
with partial pivoting. It was argued in that paper that traditional techniques are
inadequate because the transformations break definition-use dependencies. We show how
the task can be accomplished purely equationally using Kleene algebra with tests.


1  Introduction 

Kleene algebra (KA) is the algebra of regular expressions. It was first introduced by
Kleene [9] and further developed by Conway [5]. Kleene algebra has appeared in one
form or another in relational algebra, semantics and logics of programs, automata and
formal language theory, and the design and analysis of algorithms. Many authors have
contributed over the years to the development of the algebraic theory; see [11] and
references therein.

Kleene algebra with tests (KAT), introduced in [11], combines programs and
assertions in a purely equational system. Simply stated, a Kleene algebra with tests is a
Kleene algebra with an embedded Boolean subalgebra. KAT strictly subsumes
propositional Hoare Logic (PHL), is of no greater complexity than PHL, and is deductively
complete over relational models (PHL is not) [14, 4, 12, 15]. KAT is less expressive than
propositional Dynamic Logic (PDL) [7], but in the current state of complexity-theoretic
knowledge¹ it is also less complex. Moreover, KAT requires nothing beyond classical
equational logic, in contrast to PHL or PDL, which depend on a more complicated
syntax involving partial correctness assertions or modalities.

¹ Specifically, unless PSPACE = EXPTIME.

KAT has been applied successfully in a number of low-level verification tasks
involving communication protocols, basic safety analysis, concurrency control, and local
compiler optimizations [2, 3, 13]. A useful feature of KAT in this regard is its ability to
accommodate certain basic equational assumptions regarding the interaction of atomic
instructions. This feature makes KAT ideal for reasoning about the correctness of low-level
code transformations.

In  this  paper  we  report  on  the  use  of  KAT  in  a  substantial  compiler  verification  task. 
Mateev  et  al.  [16]  have  described  a  series  of  source-level  transformations  for  automatic 
cache  blocking  in  LU  decomposition  with  partial  pivoting.  These  transformations  are 
used  primarily  in  large  applications  to  enhance  locality  of  reference.  In  attempting  to 
verify  the  correctness  of  these  transformations,  Mateev  et  al.  observed  that  the  standard 
approach involving symbolic dependence analysis is inadequate. The major complication
is that, although the transformations are semantically correct, they do not preserve
definition-use  dependencies.  This  led  them  to  consider  other  approaches  that  exploit 
knowledge  of  the  semantics  of  the  basic  operations.  They  proposed  a  new  system  called 
fractal  symbolic  analysis,  in  which  programs  are  repeatedly  simplified  until  symbolic 
analysis  becomes  feasible.  The  semantics  is  not  preserved  in  the  simplification  process, 
but  the  equality  of  the  simplified  programs  implies  the  equality  of  the  original  programs. 

In  this  paper  we  demonstrate  that  the  same  verification  task  studied  by  Mateev  et 
al.  can  be  adequately  handled  by  KAT  in  a  purely  equational  way.  The  semantics  of 
the  underlying  domain  of  computation  are  incorporated  only  as  Boolean  axioms,  as  in 
Hoare  logic.  The  code  transformations  themselves  are  purely  schematic.  The  atomic- 
level  code  transformations  are  instances  of  a  small  set  of  basic  schematic  rules  governing 
the  interaction  of  atomic  programs  and  tests.  These  rules  play  roughly  the  same  role  as 
the  assignment  rule  in  Hoare  Logic,  but  are  more  versatile.  All  other  transformations  are 
instances  of  theorems  of  KAT. 


2  Kleene  Algebra  and  Kleene  Algebra  with  Tests 

Kleene algebra was introduced by S. C. Kleene [9] (see also [5]). We define a Kleene
algebra (KA) to be a structure (K, +, ·, *, 0, 1), where (K, +, ·, 0, 1) is an idempotent
semiring, p*q is the least solution to q + px ≤ x, and qp* the least solution to q + xp ≤ x.
Here “least” refers to the natural partial order p ≤ q ⟺ p + q = q. The operation + gives
the supremum with respect to ≤. This particular axiomatization is from [10].

We normally omit the ·, writing pq for p · q. The precedence of the operators is
* > · > +. Thus p + qr* should be parsed p + (q(r*)).

Typical models include the family of regular sets of strings over a finite alphabet, the
family of binary relations on a set, and the family of n × n matrices over another Kleene
algebra.

The  following  are  some  elementary  theorems  of  KA. 


    p* = 1 + pp* = 1 + p*p = p*p* = p**        (1)
    p(qp)* = (pq)*p                             (2)
    p*(qp*)* = (p + q)* = (p*q)*p*              (3)
    qp = 0 → (p + q)* = p*q*                    (4)
    px = xq → p*x = xq*                         (5)

The identities (2) and (3) are called the sliding rule and the denesting rule, respectively.
These rules are particularly useful in program equivalence proofs. The property (5) is a
kind of bisimulation property. It plays a prominent role in the completeness proof of [10].
We refer the reader to [10] for further definitions and basic results.
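
The relational model mentioned above can be made concrete on a small finite set. The following Python sketch is only an illustration, not part of the formal development: the three-element state set and the sample relations p and q are arbitrary choices made here. It checks the sliding rule (2) and the denesting rule (3) on one instance.

```python
# Finite relational model of Kleene algebra: elements are binary relations on STATES.
STATES = range(3)
ONE = frozenset((s, s) for s in STATES)   # identity relation, the constant 1
ZERO = frozenset()                        # empty relation, the constant 0

def plus(p, q):
    """p + q: union of relations."""
    return p | q

def seq(p, q):
    """pq: relational composition."""
    return frozenset((a, c) for (a, b) in p for (b2, c) in q if b == b2)

def star(p):
    """p*: reflexive transitive closure, computed as a least fixpoint."""
    result = ONE
    while True:
        bigger = result | seq(result, p)
        if bigger == result:
            return result
        result = bigger

# Two arbitrary sample relations (hypothetical data chosen for this sketch).
p = frozenset({(0, 1), (1, 2)})
q = frozenset({(1, 0), (2, 2)})

# Sliding rule (2): p(qp)* = (pq)*p
sliding_holds = seq(p, star(seq(q, p))) == seq(star(seq(p, q)), p)

# Denesting rule (3): p*(qp*)* = (p + q)* = (p*q)*p*
denesting_holds = (
    seq(star(p), star(seq(q, star(p))))
    == star(plus(p, q))
    == seq(star(seq(star(p), q)), star(p))
)

print(sliding_holds, denesting_holds)
```

A check of this kind confirms an identity only on the chosen instance; the identities themselves hold in every Kleene algebra.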

A Kleene algebra with tests (KAT) [11] is a Kleene algebra with an embedded Boolean
subalgebra. More precisely, it is a two-sorted structure (K, B, +, ·, *, ¯, 0, 1), where ¯
is a unary operator defined only on B, such that B ⊆ K, (K, +, ·, *, 0, 1) is a Kleene
algebra, and (B, +, ·, ¯, 0, 1) is a Boolean algebra. The elements of B are called tests.

We  reserve  the  letters  p,  q,  r,  s, . . .  for  arbitrary  elements  of  K  and  a,  b,  c, . . .  for  tests. 

When applied to arbitrary elements of K, the operators +, ·, 0, 1 refer to
nondeterministic choice, composition, fail and skip, respectively. Applied to tests, they take on the
additional meaning of Boolean disjunction, conjunction, falsity and truth, respectively.
These two usages do not conflict; for example, sequentially testing b and c is the same as
testing their conjunction bc.

The encoding of the while program constructs is as in PDL [7]. The conditional
test if b then p else q and while loop while b do p are expressed as bp + b̄q and (bp)*b̄,
respectively.

The propositional fragment of Hoare Logic is subsumed by KAT [12]. The Hoare
partial correctness assertion {b} p {c} is expressed bpc̄ = 0, or equivalently, bpc = bp.
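
As an illustration of these encodings, the following Python sketch interprets tests and programs as binary relations on a six-element state space and compares (bp)*b̄ with a directly executed while loop. The state space, the test x < 3, and the program x := x + 1 are assumptions chosen for this example only.

```python
# States are the possible values of a single integer variable x.
STATES = range(6)
ONE = frozenset((s, s) for s in STATES)
ZERO = frozenset()

def seq(p, q):
    """Relational composition pq."""
    return frozenset((a, c) for (a, b) in p for (b2, c) in q if b == b2)

def star(p):
    """Reflexive transitive closure p*."""
    result = ONE
    while True:
        bigger = result | seq(result, p)
        if bigger == result:
            return result
        result = bigger

# Test b: x < 3, as a subset of the identity relation; its complement is ONE - b.
b = frozenset((s, s) for s in STATES if s < 3)
not_b = ONE - b

# Program p: x := x + 1 (transitions leaving the finite state space are dropped).
p = frozenset((s, s + 1) for s in STATES if s + 1 in STATES)

# KAT encoding of "while b do p".
while_loop = seq(star(seq(b, p)), not_b)

def run_while(x):
    """Direct execution of: while x < 3 do x := x + 1."""
    while x < 3:
        x += 1
    return x

expected = frozenset((s, run_while(s)) for s in STATES)

# Hoare triple {x < 3} x := x + 1 {x <= 3}, expressed as b p (not c) = 0.
not_c = frozenset((s, s) for s in STATES if not s <= 3)
triple_holds = seq(seq(b, p), not_c) == ZERO

print(while_loop == expected, triple_holds)
```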


The  following  are  some  basic  theorems  of  KAT. 

    bq = qb  → bq* = (bq)*b = q*b = b(qb)*          (6)
    bq = bqb → bq* = (bq)*b = bq*b = b(qb)*         (7)
    bp = pc ⟺ b̄p = pc̄ ⟺ bpc̄ + b̄pc = 0.              (8)


A  proof  of  (8)  was  given  in  [1].  See  [11]  for  further  definitions  and  basic  results. 

For  applications  in  program  verification,  the  standard  interpretation  would  be  a  KA 
of  binary  relations  on  a  set  and  the  Boolean  algebra  of  subsets  of  the  identity  relation. 

2.1  Schematic  Reasoning 

KAT, so far described, is propositional. Programs and tests are interpreted over an
abstract set of states, and programs are interpreted as abstract binary relations or traces. In
applications, however, we must instantiate these constructs. This involves the
introduction of symbols for variables, constants, functions, and relations ranging over a domain of
computation. At this level, a state of the computation is typically taken to be a valuation
of the program variables over the domain of computation, and state changes are effected
by assignment statements x := e, where x is an individual program variable and e is a
term. This extension is called SKAT (for schematic KAT) and was investigated in [1].

To  avoid  confusion  when  reasoning  at  this  level,  we  use  the  symbol  =  for  equality 
between  programs  (equality  in  KAT),  =  for  equality  in  the  underlying  domain,  :=  for 
assignment, ⊕ for addition in KAT, and + for addition in the underlying domain. (When
reasoning purely propositionally, we will continue to use the symbols = and + for
equality and addition in KAT.)

Many  general  properties  can  already  be  derived  at  the  schematological  level,  without 
specifying  a  particular  interpretation  of  these  new  symbols.  For  example,  consider  the 
following identities:

    x := s ; y := t = y := t[x/s] ; x := s     (y ∉ FV(s))     (9)
    x := s ; y := t = x := s ; y := t[x/s]     (x ∉ FV(s))     (10)
    x := s ; x := t = x := t[x/s]                              (11)
    φ[x/t] ; x := t = x := t ; φ                               (12)

where in (9) and (10), x and y are distinct variables and FV(s) denotes the set of variables
occurring in s. Special cases of (9) and (12) are the commutativity conditions

    x := s ; y := t = y := t ; x := s     (x ∉ FV(t), y ∉ FV(s))     (13)
    φ ; x := t = x := t ; φ               (x ∉ FV(φ))                (14)

The soundness of these identities over all schematic interpretations was proved in [1].
Another  valid  identity  not  considered  in  [1]  is 

x  :=  x  =  1.  (15) 


All of the properties (9)-(15) make sense even in the absence of equality. When equality
is  present  in  the  language,  we  may  insert  any  valid  formula  of  the  theory  of  equality.  The 
following  property  is  also  valid: 

s  =  t ;  x  :=  s  =  s  =  t;  x  :=  t.  (16) 

It  follows  from  (15)  and  (16)  that 

x  =  t ;  x  :=  t  =  x  =  t.  (17) 

One  can  also  show  that  (10)  is  a  consequence  of  (12)  and  (16). 

In traditional Hoare logic, atomic programs are assignments x := t and the only
atomic assumption is the assignment rule

    {φ[x/t]} x := t {φ},

which would be represented in KAT by either of the two equivalent equations

    φ[x/t] ; x := t ; φ = φ[x/t] ; x := t
    φ[x/t] ; x := t ; ¬φ = 0.

These equations follow from (12). In fact, (12) is actually equivalent to two applications
of the Hoare assignment rule, one for φ and one for its negation. This can be seen by
taking b, c, and p to be φ[x/t], φ, and x := t, respectively, in (8).

To illustrate the use of (9)-(14), consider the equation pqr = rqp, where p, q, and r
are the assignments y := x, y := 2 * y, and x := 2 * x, respectively. This example was
used in [16] to illustrate a simple transformation that is sound, yet breaks definition-use
dependencies. Here is an equational proof in KAT:

    pqr = y := x ; y := 2 * y ; x := 2 * x
        = y := 2 * x ; x := 2 * x                  by (11)
        = x := 2 * x ; y := x                      by (9)
        = x := 2 * x ; y := 2 * y ; y := x         by (11)
        = rqp.
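
The equation pqr = rqp can also be checked by direct execution over a range of integer states. The Python sketch below is an informal sanity check under the integer interpretation, not a replacement for the equational proof; the representation of a state as a pair (x, y) and the sampled range are choices made here for illustration.

```python
from itertools import product

# A state is a pair (x, y) of integer values.
def p(state):
    """y := x"""
    x, y = state
    return (x, x)

def q(state):
    """y := 2 * y"""
    x, y = state
    return (x, 2 * y)

def r(state):
    """x := 2 * x"""
    x, y = state
    return (2 * x, y)

def run(program, state):
    """Execute a list of assignments from left to right."""
    for step in program:
        state = step(state)
    return state

# Compare pqr and rqp on every state in a small sample range.
states = list(product(range(-3, 4), repeat=2))
pqr_equals_rqp = all(run([p, q, r], s) == run([r, q, p], s) for s in states)

print(pqr_equals_rqp)
```

Both orders end in the state (2x, 2x), even though in rqp the final value of y is defined by an assignment that reads the already doubled x.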
2.2 Interpreted Reasoning

In specific applications, we are not constrained to reason purely schematically. We may
take advantage of the fact that we are reasoning with respect to a particular interpretation
or class of interpretations over some underlying domain or class of domains, and
deduction is relative to the theory of that class of interpretations. This theory determines the
Boolean algebra of the KAT in which we work. A valid assertion φ takes the form of an
equation φ = 1 in KAT, which may be taken as an extra axiom for deductive purposes.

In the application considered in this paper, variables are interpreted as integers or
reals, + is interpreted as addition, etc. Valid number-theoretic properties such as
i ≤ j ⟺ i < j + 1 can be introduced as needed.

Here  is  an  example  of  a  property  that  can  be  derived  at  this  level  involving  for  loops. 
If  i  and  n  are  integer  variables  and  p  and  i  <  n  commute  (for  example,  if  p  does  not 
assign  to  i  or  n ),  then 

    i ≤ n ; (i < n ; p ; i++)* = (i < n ; p ; i++)* ; i ≤ n.     (18)

Here i++ is an abbreviation for i := i + 1. Letting

    c  def=  i < n
    d  def=  i ≤ n
    q  def=  p ; i++,

from (12) and number theory we have c ≤ d and cq = qd. Once we have established these
premises that are particular to the interpretation, we can reason purely propositionally in
KAT to obtain

    c ≤ d ∧ cq = qd → d(cq)* = (cq)*d,

the conclusion of which is (18).


2.3  Arrays 

For our application, we will need to extend (9)-(17) to handle arrays. Theories of arrays
have been considered by several authors, among them [18, 19, 6, 17]. Care must be taken
because of the possibility of aliasing.

Semantically, an array variable A is interpreted by a valuation as a map D → D,
where D is the domain of computation (see [8]). An array assignment A(s) := t maps
valuation θ to θ′, where

    θ′(A)(θ(s)) = θ(t)
    θ′(A)(a) = θ(A)(a),  a ≠ θ(s)
    θ′(x) = θ(x),  x any array or individual variable not equal to A.

Some of the extensions we will need are sound without any restrictions, such as

    s = t ; u = v ; A(s) := u = s = t ; u = v ; A(t) := v     (19)

However, other generalizations which may seem obvious at first glance turn out to be
unsound. For example,

    A(s) := t = A(s) := t ; A(s) = t

is not true in general, even if t is a constant. Over N, if s is A(2) and t is 3, and the
assignment is executed in a state in which A(2) = A(3) = 2, then the value of t after the
assignment is still 3, but the value of A(s) is 2.
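
The counterexample can be replayed directly. The following Python sketch models the array as a dictionary and the terms s and t as functions of the array state; the helper name assign and this representation are illustrative assumptions, chosen to follow the semantics of A(s) := t given above.

```python
def assign(A, index_expr, value_expr):
    """Execute A(s) := t: evaluate s and t in the old state, then store the value."""
    index = index_expr(A)
    value = value_expr(A)
    updated = dict(A)
    updated[index] = value
    return updated

s = lambda A: A[2]      # the index term s is A(2)
t = lambda A: 3         # the term t is the constant 3

before = {2: 2, 3: 2}   # a state in which A(2) = A(3) = 2
after = assign(before, s, t)

value_of_t = t(after)             # t is a constant, so still 3
value_of_A_s = after[s(after)]    # A(s) is now A(A(2)) = A(3)

print(after, value_of_t, value_of_A_s)
```

The assignment stores 3 into A(2), which changes the value of the index term s itself from 2 to 3, so A(s) afterwards refers to the untouched cell A(3).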

Most  of  the  properties  we  will  need  are  consequences  of  the  following  metatheorem, 
which  allows  us  to  transfer  properties  without  arrays  to  properties  with  arrays.  Define  an 
expression  to  be  simple  if  it  contains  no  array  symbol. 

Theorem 2.1 Let V be a finite set of individual variables and let A be an array variable.
For each x ∈ V, let i_x be a simple term. Suppose p = q is a valid equation such that
neither p nor q contains an occurrence of A or an assignment to any variable in i_x,
x ∈ V. Then the following is also a valid equation:

    ⋀{i_x ≠ i_y | x, y ∈ V, x ≠ y} ; p[x/A(i_x) | x ∈ V]
      = ⋀{i_x ≠ i_y | x, y ∈ V, x ≠ y} ; q[x/A(i_x) | x ∈ V].

Proof. For any expression e, abbreviate e[x/A(i_x) | x ∈ V] by e′. The atomic
instructions of p′ and q′ are all of the form y := t′ or A(i_y) := t′. Because the i_x are
simple and neither p nor q (therefore neither p′ nor q′) assigns to any variable of i_x, none
of these atomic instructions can change the value of i_x. Thus the atomic instructions of
p′ and q′ commute with the precondition ⋀{i_x ≠ i_y | x, y ∈ V, x ≠ y}. By an inductive argument
involving (6), so do all subprograms of p′ and q′. Thus the A(i_x), x ∈ V, behave like
fixed and distinct individual variables throughout the computation of p′ and q′. □

Applying Theorem 2.1 to the axioms (9)-(17) of SKAT gives corresponding axioms
that apply to arrays. For example, the following conditions are consequences of Theorem
2.1 applied to (9). These equations hold under the assumptions of Theorem 2.1 and in
the presence of the implicit precondition i_x ≠ i_y. For any expression e, let e^x and e^xy
abbreviate e[x/A(i_x)] and e[x/A(i_x), y/A(i_y)], respectively.

    A(i_x) := s^x ; A(i_y) := t^xy = A(i_y) := t^y[x/s^x] ; A(i_x) := s^x     (20)
    x := s ; A(i_y) := t^y = A(i_y) := t^y[x/s] ; x := s                       (21)
    A(i_x) := s^x ; y := t^x = y := t[x/s^x] ; A(i_x) := s^x                   (22)

where y ∉ FV(s). Other axioms for arrays obtained similarly are

    A(i_x) := s^y ; A(i_y) := t^xy = A(i_x) := s^y ; A(i_y) := t^y[x/s^y]     (23)
    A(i_x) := s^x ; A(i_x) := t^x = A(i_x) := t[x/s^x]                         (24)
    φ[y/t^y] ; A(i_y) := t^y = A(i_y) := t^y ; φ^y                             (25)

where x ∉ FV(s) in (23).


More  general  versions  hold,  but  these  are  sufficient  for  our  purposes,  so  we  leave  a 
more  thorough  analysis  for  future  work.  We  will  take  these  conditions  as  axioms  when 
reasoning  in  the  presence  of  arrays. 
(a) Original Program

    do j = 1, N-1
      B1(j): //swap
        tmp = A(j);
        A(j) = A(j+1);
        A(j+1) = tmp;
      B2(j): //update
        do i = j+1, N
          A(i) = A(i)/A(j);

(b) Transformed Program

    do j = 1, N-1
      B1(j): //swap
        tmp = A(j);
        A(j) = A(j+1);
        A(j+1) = tmp;
    do j = 1, N-1
      B2(j): //update
        do i = j+1, N
          A(i) = A(i)/A(j);

Figure 1: Loop Distribution Example from [16]


3  Loop  Distribution — A  Simplified  Example 

The  transformation  shown  in  Figure  1  is  from  [16].  This  is  a  simplified  version  of  LU 
factorization  with  partial  pivoting  used  to  illustrate  various  aspects  of  their  technique.  As 
they  describe  it: 

The source program of Figure 1(a) traverses an array A; at the jth iteration, it
swaps elements A(j) and A(j + 1), and updates all the elements from A(j + 1)
through A(N) using the new value in A(j). This is a much simplified
version of LU factorization with partial pivoting in which entire rows of a
matrix are swapped and entire submatrices are updated at each step ...

Loop distribution transforms this program into the one shown in Figure 1(b).
In this program, all the swaps are done first, and then all the updates are
done together. This transformation is useful because the second loop nest is
perfectly nested and can be tiled to get good locality of reference. Are these
programs equal?

Dependence analysis requires that there not be a dependence from an
instance B2(j2) to an instance B1(j1) where j1 > j2. Unfortunately, this
condition is violated: instance B2(j0) writes to location A(j0 + 1), and
instance B1(j0 + 1) reads from it. Symbolic analysis of these programs on the
other hand is too difficult. [16]

The  remainder  of  this  article  is  devoted  to  giving  a  formal,  purely  equational  proof  of 
equivalence  using  KAT. 

Let swap(j) denote the three-line subprogram labeled B1(j) and let update(j)
denote the two-line subprogram labeled B2(j) in Figure 1. Let u(i, j) denote the array
assignment A(i) := A(i)/A(j). Expressed in the language of KAT,

    swap(j) = tmp := A(j) ; A(j) := A(j + 1) ; A(j + 1) := tmp
    update(j) = i := j + 1 ; (i ≤ N ; u(i, j) ; i++)* ; i > N,
and the programs of Figure 1(a) and (b) are

    j := 1 ; (j < N ; swap(j) ; update(j) ; j++)* ; j ≥ N          (26)

    j := 1 ; (j < N ; swap(j) ; j++)* ; j ≥ N ;
    j := 1 ; (j < N ; update(j) ; j++)* ; j ≥ N,                   (27)

respectively. We wish to show that (26) and (27) are equivalent.
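
Before turning to the proof, the claimed equivalence can be tested empirically. The Python sketch below runs both programs of Figure 1 on random arrays of nonzero rationals; the use of exact fractions, 1-based indexing with a dummy cell 0, and the function names original and transformed are choices made here for illustration. Agreement on samples is of course only evidence, not a proof.

```python
from fractions import Fraction
import random

def swap(A, j):
    """B1(j): exchange A(j) and A(j+1) through a temporary."""
    tmp = A[j]
    A[j] = A[j + 1]
    A[j + 1] = tmp

def update(A, j, N):
    """B2(j): divide A(j+1), ..., A(N) by A(j)."""
    for i in range(j + 1, N + 1):
        A[i] = A[i] / A[j]

def original(values):
    """Figure 1(a): swap and update interleaved in a single loop."""
    N = len(values)
    A = [None] + list(values)      # cell 0 is unused so that indices are 1-based
    for j in range(1, N):
        swap(A, j)
        update(A, j, N)
    return A[1:]

def transformed(values):
    """Figure 1(b): all swaps first, then all updates."""
    N = len(values)
    A = [None] + list(values)
    for j in range(1, N):
        swap(A, j)
    for j in range(1, N):
        update(A, j, N)
    return A[1:]

random.seed(0)
samples = [
    [Fraction(random.randint(1, 9)) for _ in range(random.randint(2, 6))]
    for _ in range(100)
]
all_equal = all(original(v) == transformed(v) for v in samples)

print(all_equal)
print(original([Fraction(2), Fraction(3), Fraction(5)]))
```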

Lemma 3.1 Let a, b, t be distinct variables. Let f(x) be a term with a variable x but no
occurrence of a, b, or t. The following two schemes are equivalent:

    a := f(a) ; b := f(b) ; t := a ; a := b ; b := t ; t := ⊥      (28)
    t := a ; a := b ; b := t ; t := ⊥ ; a := f(a) ; b := f(b).     (29)

Proof. Starting from (28), we can move b := f(b) right past the next two assignments
using (13) and (9), then annihilate it using (11) to obtain

    a := f(a) ; t := a ; a := f(b) ; b := t ; t := ⊥.

Similarly, we can move a := f(a) right past the next assignment using (9), then annihilate
it using (11) to obtain

    t := f(a) ; a := f(b) ; b := t ; t := ⊥.

Applying (11) in the right-to-left direction to t := f(a), we obtain

    t := a ; t := f(t) ; a := f(b) ; b := t ; t := ⊥.

Now we can move t := f(t) right past the next two assignments using (13) and (9), then
annihilate it using (11) to obtain

    t := a ; a := f(b) ; b := f(t) ; t := ⊥.     (30)

Starting from (29), we can apply (13) three times to move t := ⊥ all the way to the
right and exchange b := t and a := f(a) to obtain

    t := a ; a := b ; a := f(a) ; b := t ; b := f(b) ; t := ⊥.

Now two independent applications of (11) yield (30). □
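
The two schemes of Lemma 3.1 can be executed side by side for a concrete choice of f. In the Python sketch below, the term f(x) = 3x + 1 and the use of None to stand for the undefined value ⊥ are assumptions made for illustration.

```python
BOTTOM = None   # stands for the undefined value assigned by t := ⊥

def f(x):
    """A hypothetical term f(x) that mentions none of a, b, t."""
    return 3 * x + 1

def scheme_28(a, b, t):
    """a := f(a) ; b := f(b) ; t := a ; a := b ; b := t ; t := ⊥"""
    a = f(a)
    b = f(b)
    t = a
    a = b
    b = t
    t = BOTTOM
    return (a, b, t)

def scheme_29(a, b, t):
    """t := a ; a := b ; b := t ; t := ⊥ ; a := f(a) ; b := f(b)"""
    t = a
    a = b
    b = t
    t = BOTTOM
    a = f(a)
    b = f(b)
    return (a, b, t)

inputs = [(a, b, t) for a in range(-2, 3) for b in range(-2, 3) for t in range(2)]
schemes_agree = all(scheme_28(*s) == scheme_29(*s) for s in inputs)

print(schemes_agree, scheme_28(1, 2, 0))
```

Both schemes leave f applied to the old b in a and f applied to the old a in b; the final t := ⊥ is what makes the differing intermediate values of t unobservable.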

Lemma 3.2 Let j, k, N be distinct variables. The following programs are equivalent:

    j < k < N ; update(j) ; swap(k)      (31)
    j < k < N ; swap(k) ; update(j).     (32)

Proof. Let q abbreviate the program u(i, j) ; i++ and let w abbreviate the program
i := j + 1. First we show that under the precondition j < k < N, we can decompose
update(j) as follows:

    j < k < N ; update(j)
      = j < k < N ; w ; (i < k ; q)* ;
        i = k ; u(k, j) ; u(k + 1, j) ;
        i := k + 2 ; (i ≤ N ; q)* ; i > N.     (33)

To see this, first note that

    update(j)
      = w ; (i ≤ N ; q)* ; i > N
      = w ; (i ≤ N ; (i < k ⊕ i ≥ k) ; q)* ; i > N
      = w ; ((i ≤ N ; i < k ; q) ⊕ (i ≤ N ; i ≥ k ; q))* ; i > N
      = w ; (i ≤ N ; i < k ; q)* ; (i ≤ N ; i ≥ k ; q)* ; i > N.     (34)

The last step follows from (4). The precondition of (4) is

    i ≤ N ; i ≥ k ; q ; i ≤ N ; i < k ; q = 0.     (35)

To see (35), note that i ≥ k and u(i, j) commute by (14) (amended by Theorem 2.1 to
handle arrays), and i < k and i ≤ N commute by Boolean algebra. It thus suffices to
show i ≥ k ; i++ ; i < k = 0. But this is immediate from (12) and number theory.

By Boolean algebra and (12), we have

    j < k < N ; w = j < k < N ; w ; i ≤ k < N,

thus by (34),

    j < k < N ; update(j)
      = j < k < N ; w ; i ≤ k < N ;
        (i ≤ N ; i < k ; q)* ; (i ≤ N ; i ≥ k ; q)* ;
        i > N.     (36)

Since k < N and i < k imply i ≤ N, by (6) we have

    i ≤ k < N ; (i ≤ N ; i < k ; q)* = i ≤ k < N ; (i < k ; q)*.

By (6) and (18), this is equivalent to

    i ≤ k < N ; (i < k ; q)* ; i ≤ k < N.     (37)

We also have

    i < k < N ; (i ≤ N ; i ≥ k ; q)* ; i > N
      = i < k < N ; i ≤ N ; i ≥ k ; q ; (i ≤ N ; i ≥ k ; q)* ; i > N
        ⊕ i < k < N ; i > N
      = 0,

therefore by (6),

    i ≤ k < N ; (i ≤ N ; i ≥ k ; q)* ; i > N
      = i = k < N ; (i ≤ N ; i ≥ k ; q)* ; i > N
      = i = k < N ; (i ≤ N ; q)* ; i > N.     (38)

Combining (36), (37), and (38), we have

    j < k < N ; update(j)
      = j < k < N ; w ;
        i ≤ k < N ; (i < k ; q)* ;
        i = k < N ; (i ≤ N ; q)* ; i > N.     (39)

Let s = i ≤ N ; q, the body of the last loop in (39). Unrolling this loop twice, the last
line of (39) becomes

    i = k < N ; s* ; i > N
      = i = k < N ; i > N ⊕ i = k < N ; s ; i > N ⊕ i = k < N ; s ; s ; s* ; i > N

and the first two terms vanish. Also, an elementary argument using (12), (16), and number
theory yields

    i = k < N ; s ; s = i = k < N ; u(k, j) ; u(k + 1, j) ; i := k + 2.

Combining these observations with (39) gives

    j < k < N ; update(j)
      = j < k < N ; w ;
        i ≤ k < N ; (i < k ; q)* ;
        i = k < N ; u(k, j) ; u(k + 1, j) ; i := k + 2 ; s* ; i > N.     (40)

Thus the program (31) is equivalent to

    j < k < N ;
    w ; i ≤ k < N ; (i < k ; q)* ;
    i = k < N ; u(k, j) ; u(k + 1, j) ;
    i := k + 2 ; (i ≤ N ; q)* ; i > N ;
    swap(k),

and swap(k) commutes with each of the three lines above it by (13), Lemma 3.1, and
(13), respectively (amended by Theorem 2.1 to handle arrays). The final result is (32).

□
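
The role of the precondition j < k < N can be seen by executing the two orders directly. The Python sketch below is an informal check on random rational arrays, with 1-based indexing and function names chosen here for illustration; it also shows one input on which the two orders differ when j = k, so the precondition cannot simply be dropped.

```python
from fractions import Fraction
import random

def swap(A, k):
    """swap(k): exchange A(k) and A(k+1) through a temporary."""
    tmp = A[k]
    A[k] = A[k + 1]
    A[k + 1] = tmp

def update(A, j, N):
    """update(j): divide A(j+1), ..., A(N) by A(j)."""
    for i in range(j + 1, N + 1):
        A[i] = A[i] / A[j]

def update_then_swap(values, j, k):
    N = len(values)
    A = [None] + list(values)      # cell 0 unused, indices are 1-based
    update(A, j, N)
    swap(A, k)
    return A[1:]

def swap_then_update(values, j, k):
    N = len(values)
    A = [None] + list(values)
    swap(A, k)
    update(A, j, N)
    return A[1:]

random.seed(1)
N = 5
commute_when_j_lt_k = all(
    update_then_swap(v, j, k) == swap_then_update(v, j, k)
    for _ in range(50)
    for v in [[Fraction(random.randint(1, 9)) for _ in range(N)]]
    for j in range(1, N)
    for k in range(j + 1, N)
)

# With j = k the two orders generally differ.
sample = [Fraction(2), Fraction(3), Fraction(5)]
differ_when_j_eq_k = update_then_swap(sample, 1, 1) != swap_then_update(sample, 1, 1)

print(commute_when_j_lt_k, differ_when_j_eq_k)
```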

Lemma 3.3 Let j, k, N be distinct variables. The following programs are equivalent:

    j < k ; j < N ; update(j) ; j++ ; k < N ; swap(k) ; k++      (41)
    j < k ; k < N ; swap(k) ; k++ ; j < N ; update(j) ; j++.     (42)

Proof. By (13) and (14), these programs are equivalent to

    j < k ; k < N ; j < N ; update(j) ; swap(k) ; k++ ; j++
    j < k ; k < N ; j < N ; swap(k) ; update(j) ; k++ ; j++,

respectively. The equivalence of these two programs follows immediately from Lemma
3.2. □


Lemma 3.4 Let p, q be program symbols and e, c, d test symbols. Under the assumptions

    epq = pqe     (43)
    dp = pd       (44)
    cq = qc       (45)
    ec = ed       (46)
    c̄p = 0        (47)
    d̄q = 0,       (48)

the following equation holds:

    e(pq)*c̄d̄ = e(pq)*(p* + q*)c̄d̄.     (49)

Proof. It follows from (46) and Boolean algebra that ec̄ = ed̄. Then

    e(pq)*q*c̄ = (pq)*e(1 + qq*)c̄       by (43) and (6)
              = (pq)*(ec̄ + eqq*c̄)
              = (pq)*(ec̄ + ec̄qq*)       by (45), using (6) and (8)
              = (pq)*(ec̄ + ed̄qq*)
              = (pq)*ec̄                  by (48)
              = e(pq)*c̄                  by (43) and (6).

A symmetric argument using (44) and (47) shows that e(pq)*p*d̄ = e(pq)*d̄. The
equation (49) follows immediately from these two equations. □


Lemma  3.5  Let  p,  q  be  program  symbols  and  a,  b  test  symbols.  Under  the  assumptions 


bp  =  pa  (50) 

aq  =  qb  (51) 

ap  =  apa  (52) 

apq  =  aqp,  (53) 

the  following  equations  hold: 

ap*q  =  aqp*  (54) 

a(qp)*(p*+q*)  =  ap*q*  (55) 

b(pq)*(p*+q*)  =  bp*q* .  (56) 
Proof. We first show that (54) follows from (52) and (53). For the direction ≤, using
(53) we have

    aq + apaqp* ≤ aq + apqp* = aq + aqpp* = aqp*,

therefore (ap)*aq ≤ aqp* by an axiom of Kleene algebra. Then by (52), (7), and (2),

    ap*q = a(pa)*q = (ap)*aq ≤ aqp*.

For the reverse inequality, by (7) and (53),

    aq + ap*qp = aq + ap*aqp = aq + ap*apq = aq + ap*pq = ap*q,

therefore aqp* ≤ ap*q by an axiom of Kleene algebra.

It follows from (50) and (51) using (6) that aqp = qpa and a(qp)* = (qp)*a. Also,
by (52) and (7), ap* = ap*a.

For the direction ≤ of (55), we must show

    a(qp)*p* ≤ ap*q*     (57)
    a(qp)*q* ≤ ap*q*.    (58)

For (57), by (53), (52), and (54), we have

    ap* + qpap*q* = ap* + aqpp*q* = ap* + apqp*q*
                  = ap* + apaqp*q* = ap* + apap*qq* ≤ ap*q*.

By (6) and an axiom of Kleene algebra,

    a(qp)*p* = (qp)*ap* ≤ ap*q*.

This is (57). Equation (58) follows, since

    a(qp)*q* ≤ a(qp)*p*q* ≤ ap*q*q* ≤ ap*q*.

For the direction ≥ of (55), by an axiom of Kleene algebra it suffices to show

    ap* + a(qp)*(p* + q*)q ≤ a(qp)*(p* + q*).

This follows from the four inequalities

    ap* ≤ a(qp)*(p* + q*)
    a(qp)*q ≤ a(qp)*(p* + q*)
    a(qp)*pp*q ≤ a(qp)*(p* + q*)
    a(qp)*q*q ≤ a(qp)*(p* + q*),

of which all but the third are obvious. For the third, we use (6), (52), (53), and (54):

    a(qp)*pp*q = (qp)*app*q = (qp)*apap*q = (qp)*apaqp*
               = (qp)*apqp* = (qp)*aqpp* = a(qp)*qpp* ≤ a(qp)*p*.
Finally, for (56), we have by (2), (6), (54), and (55) that

aq(pq)*(p*  +  q*)  =  a(qp)*q(p*  +  q*)  =  (qp)*aq(p*  +  q*) 

=  (qp)*a(p* +q*)q  =  a(qp)*(p*  +  q*)q  =  ap*q*q, 

thus  by  (50), 

bpq(pq)*(p*  +q*)  =  paq(pq)*(p*  +  q*)  =  pap*qq*  =  bpp*qq*. 

It  follows  that 

b(pq)*(p*  +  q*)  =  bp*+bq*  +  bpq(pq)*(p*+q*) 

=  bp*+bq*  +  bpp*qq* 

=  bp*q*. 

□ 

We  are  now  ready  to  prove  our  main  theorem. 

Theorem 3.6 The following two programs, with k := ⊥ implicitly appended, are
equivalent:

    j := 1 ; (j < N ; swap(j) ; update(j) ; j++)* ; j ≥ N      (59)

    k := 1 ; (k < N ; swap(k) ; k++)* ; k ≥ N ;
    j := 1 ; (j < N ; update(j) ; j++)* ; j ≥ N.               (60)

Proof. Under the abbreviations

    a  def=  j < k        p  def=  k < N ; swap(k) ; k++
    b  def=  j ≤ k        q  def=  j < N ; update(j) ; j++
    c  def=  k < N        r  def=  k := 1
    d  def=  j < N        s  def=  j := 1,
    e  def=  j = k

all the premises (43)-(48) and (50)-(53) of Lemmas 3.4 and 3.5 hold. These facts are all
immediate except (53), which is Lemma 3.3. Program (59) is

s ;  (d  ;  swap(j)  ;  update(j)  ;  j++ ■)*  ;  d. 

Because  k  :=  _L  is  implicitly  appended,  by  [1,  Lemma  4.5],  this  is  equivalent  to 

rs ;  (d  ;  swap(j)  ;  update(j)  ;  k  :=  j  +  1  ;  j++)*  ;  d. 

By  two  applications  of  (12),  the  assignments  k  :=  1 ;  j  :=  1  establish  the  property  j  =  k; 
thus  rs  =  rse.  Since  e  commutes  with  d  ;  swap(j)  ;  update(j)  and  with  k  :=  j  + 1 ;  j++, 
by  (6)  we  obtain 

rse  ;  (de  ;  swap(j)  ;  update(j)  ;  e  ;  k  :=  j  +  1 ;  j++)*  ;  de 
=  rse  ;  (cde  ;  swap(j)  ;  update(j)  ;  e  ;  k  :=  j  +  1  ;  j++)*  ;  cde. 
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By  two  applications  of  (16),  this  is  equivalent  to 


rse  ;  (cde  ;  swap(fc) ;  update(j)  ;  e  ;  k++  :  j++)*  ;  cde. 

Removing  e  again  by  (6),  then  using  commutativity,  this  is  equivalent  to 

rs ; (c ; swap(k) ; k++ ; d ; update(j) ; j++)* ; c̄d̄
= rs(pq)*c̄d̄.

Applying  Lemmas  3.4  and  3.5  and  some  obvious  commutativity  conditions,  we  obtain 

rs(pq)*c̄d̄ = rse(pq)*c̄d̄ = rse(pq)*(p* + q*)c̄d̄

= rseb(pq)*(p* + q*)c̄d̄ = rsebp*q*c̄d̄ = rsp*q*c̄d̄
= rp*c̄sq*d̄,

which  is  (60).  □ 
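
The equivalence can also be observed experimentally on a toy model. The following sketch is an illustration under stated assumptions, not the LU code analyzed in the paper: swap(k) is modeled as exchanging entry k with a pivot entry piv[k] ≥ k, and update(j) as rewriting entry j alone, so that update(j) commutes with swap(k) whenever j < k, which is the property supplied by Lemma 3.3. The names piv, program_59, program_60, and agree_on_random_inputs are hypothetical.

```python
import random

def swap(a, piv, k):
    # Assumed swap(k): exchange entry k with its pivot entry piv[k] >= k.
    a[k], a[piv[k]] = a[piv[k]], a[k]

def update(a, j):
    # Assumed update(j): touches entry j only, so it commutes with
    # swap(k) for j < k (the role played by Lemma 3.3).
    a[j] = 2 * a[j] + j

def program_59(a, piv, n):
    # j := 1 ; (j < N ; swap(j) ; update(j) ; j++)* ; j >= N
    a = list(a)
    j = 1
    while j < n:
        swap(a, piv, j)
        update(a, j)
        j += 1
    return a, j

def program_60(a, piv, n):
    # All swaps first, then all updates.
    a = list(a)
    k = 1
    while k < n:
        swap(a, piv, k)
        k += 1
    j = 1
    while j < n:
        update(a, j)
        j += 1
    return a, j  # k is dropped, mirroring the implicitly appended reset of k

def agree_on_random_inputs(trials=200, n=6, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a = [None] + [rng.randint(-9, 9) for _ in range(n)]          # 1-indexed
        piv = [None] + [rng.randint(k, n) for k in range(1, n + 1)]  # piv[k] >= k
        if program_59(a, piv, n) != program_60(a, piv, n):
            return False
    return True

print(agree_on_random_inputs())
```

Random testing of this kind can only fail to find a counterexample; the equational proof above establishes the equivalence for every interpretation satisfying the premises.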


Acknowledgements 

This work was supported in part by NSF grant CCR-0105586 and by ONR Grant N00014-01-1-0968.
The views and conclusions contained herein are those of the authors and
should  not  be  interpreted  as  necessarily  representing  the  official  policies  or  endorsements, 
either  expressed  or  implied,  of  these  organizations  or  the  US  Government. 


References 

[1]  Allegra  Angus  and  Dexter  Kozen.  Kleene  algebra  with  tests  and  program  schematology. 
Technical  Report  2001-1844,  Computer  Science  Department,  Cornell  University,  July  2001. 

[2]  Ernie  Cohen.  Lazy  caching.  Unpublished,  1994. 

[3]  Ernie  Cohen.  Using  Kleene  algebra  to  reason  about  concurrency  control.  Unpublished,  1994. 

[4]  Ernie  Cohen,  Dexter  Kozen,  and  Frederick  Smith.  The  complexity  of  Kleene  algebra  with 
tests.  Technical  Report  96-1598,  Computer  Science  Department,  Cornell  University,  July 
1996. 

[5]  John  Horton  Conway.  Regular  Algebra  and  Finite  Machines.  Chapman  and  Hall,  London, 
1971. 

[6] P. Downey and R. Sethi. Assignment commands with array references. J. Assoc. Comput.
Mach.,  25(4):652-666,  October  1978. 

[7]  Michael  J.  Fischer  and  Richard  E.  Ladner.  Propositional  dynamic  logic  of  regular  programs. 
J. Comput. Syst. Sci., 18(2):194-211, 1979.

[8] David Harel, Dexter Kozen, and Jerzy Tiuryn. Dynamic Logic. MIT Press, Cambridge, MA,
2000.

[9]  Stephen  C.  Kleene.  Representation  of  events  in  nerve  nets  and  finite  automata.  In  C.  E. 
Shannon and J. McCarthy, editors, Automata Studies, pages 3-41. Princeton University Press,
Princeton,  N.J.,  1956. 




[10]  Dexter  Kozen.  A  completeness  theorem  for  Kleene  algebras  and  the  algebra  of  regular  events. 
Infor.  and  Comput.,  110(2):366-390,  May  1994. 

[11]  Dexter  Kozen.  Kleene  algebra  with  tests.  Transactions  on  Programming  Languages  and 
Systems, 19(3):427-443, May 1997.

[12]  Dexter  Kozen.  On  Hoare  logic  and  Kleene  algebra  with  tests.  Trans.  Computational  Logic, 
1(1):60-76, July 2000.

[13] Dexter Kozen and Maria-Cristina Patron. Certification of compiler optimizations using
Kleene algebra with tests. In John Lloyd, Veronica Dahl, Ulrich Furbach, Manfred Kerber,
Kung-Kiu Lau, Catuscia Palamidessi, Luis Moniz Pereira, Yehoshua Sagiv, and Peter J.
Stuckey, editors, Proc. 1st Int. Conf. Computational Logic (CL2000), volume 1861 of Lecture
Notes in Artificial Intelligence, pages 568-582, London, July 2000. Springer-Verlag.

[14]  Dexter  Kozen  and  Frederick  Smith.  Kleene  algebra  with  tests:  Completeness  and  decidability. 
In  D.  van  Dalen  and  M.  Bezem,  editors,  Proc.  10th  Int.  Workshop  Computer  Science  Logic 
(CSL’96),  volume  1258  of  Lecture  Notes  in  Computer  Science,  pages  244-259,  Utrecht,  The 
Netherlands, September 1996. Springer-Verlag.

[15] Dexter Kozen and Jerzy Tiuryn. On the completeness of propositional Hoare logic. In
J. Desharnais, editor, Proc. 5th Int. Seminar Relational Methods in Computer Science (RelMiCS
2000),  pages  195-202,  January  2000. 

[16]  Nikolay  Mateev,  Vijay  Menon,  and  Keshav  Pingali.  Fractal  symbolic  analysis.  In  Proc.  15th 
Int. Conf. on Supercomputing, pages 38-49. ACM, ACM Press, 2001.

[17] G. Nelson and D. Oppen. Simplification by cooperating decision procedures. Trans.
Programming Languages and Systems, 1(2):245-257, 1979.

[18]  Aaron  Stump,  Clark  W.  Barrett,  David  L.  Dill,  and  Jeremy  Levitt.  A  decision  procedure  for 
an extensional theory of arrays. In 16th Symp. Logic in Comput. Sci., pages 29-37. IEEE,
June  2001. 

[19] N. Suzuki and D. Jefferson. Verification decidability of Presburger array programs. In Proc.
Conf.  Theor.  Comput.  Sci.  University  of  Waterloo,  1977. 




